Improving Automated Driving through Planning with Human Internal States
This work examines the hypothesis that partially observable Markov decision
process (POMDP) planning with human driver internal states can significantly
improve both safety and efficiency in autonomous freeway driving. We evaluate
this hypothesis in a simulated scenario where an autonomous car must safely
perform three lane changes in rapid succession. Approximate POMDP solutions are
obtained through the partially observable Monte Carlo planning with observation
widening (POMCPOW) algorithm. This approach outperforms over-confident and
conservative MDP baselines and matches or outperforms QMDP. Relative to the MDP
baselines, POMCPOW typically cuts the rate of unsafe situations in half or
increases the success rate by 50%.
Comment: Preprint before submission to IEEE Transactions on Intelligent
Transportation Systems. arXiv admin note: text overlap with arXiv:1702.0085
Online algorithms for POMDPs with continuous state, action, and observation spaces
Online solvers for partially observable Markov decision processes have been
applied to problems with large discrete state spaces, but continuous state,
action, and observation spaces remain a challenge. This paper begins by
investigating double progressive widening (DPW) as a solution to this
challenge. However, we prove that this modification alone is not sufficient
because the belief representations in the search tree collapse to a single
particle causing the algorithm to converge to a policy that is suboptimal
regardless of the computation time. This paper proposes and evaluates two new
algorithms, POMCPOW and PFT-DPW, that overcome this deficiency by using
weighted particle filtering. Simulation results show that these modifications
allow the algorithms to be successful where previous approaches fail.
Comment: Added Multilane sectio
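The two ideas named in the abstract above, progressive widening and likelihood-weighted particle beliefs, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the parameters `k` and `alpha`, and the one-dimensional Gaussian observation model are assumptions chosen for illustration.

```python
import math


def may_widen(num_children: int, num_visits: int,
              k: float = 4.0, alpha: float = 0.125) -> bool:
    """Progressive widening test: a node may add a new child
    while its child count satisfies |C| <= k * N^alpha.
    (k and alpha are illustrative defaults, not values from the paper.)"""
    return num_children <= k * (num_visits ** alpha)


def gaussian_likelihood(obs: float, state: float, sigma: float = 1.0) -> float:
    """Assumed observation model: obs ~ Normal(state, sigma^2)."""
    z = (obs - state) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def reweight(particles: list[float], obs: float, sigma: float = 1.0) -> list[float]:
    """Weight each state particle by the likelihood of the observation,
    then normalize. Keeping every particle with a weight avoids the
    collapse to a single particle described in the abstract."""
    weights = [gaussian_likelihood(obs, s, sigma) for s in particles]
    total = sum(weights)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in weights]
```

For example, `reweight([0.0, 1.0, 5.0], obs=1.0)` assigns the largest weight to the particle at 1.0 while the other particles stay in the belief with smaller weights.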
Planning with SiMBA: Motion Planning under Uncertainty for Temporal Goals using Simplified Belief Guides
This paper presents a new multi-layered algorithm for motion planning under
motion and sensing uncertainties for Linear Temporal Logic specifications. We
propose a technique to guide a sampling-based search tree in the combined task
and belief space using trajectories from a simplified model of the system, to
make the problem computationally tractable. Our method eliminates the need to
construct fine and accurate finite abstractions. We prove correctness and
probabilistic completeness of our algorithm, and illustrate the benefits of our
approach on several case studies. Our results show that guidance with a
simplified belief space model allows for significant speed-up in planning for
complex specifications.
Comment: 8 pages, to appear in the IEEE International Conference on Robotics
and Automation (ICRA), 202
Sampling-based Reactive Synthesis for Nondeterministic Hybrid Systems
This paper introduces a sampling-based strategy synthesis algorithm for
nondeterministic hybrid systems with complex continuous dynamics under temporal
and reachability constraints. We view the evolution of the hybrid system as a
two-player game, where the nondeterminism is an adversarial player whose
objective is to prevent achieving temporal and reachability goals. The aim is
to synthesize a winning strategy -- a reactive (robust) strategy that
guarantees the satisfaction of the goals under all possible moves of the
adversarial player. The approach is based on growing a (search) game-tree in
the hybrid space by combining a sampling-based planning method with a novel
bandit-based technique to select and improve on partial strategies. We provide
conditions under which the algorithm is probabilistically complete, i.e., if a
winning strategy exists, the algorithm will almost surely find it. The case
studies and benchmark results show that the algorithm is general and
consistently outperforms the state of the art.
Comment: 9 pages, 9 figures, submitted to 62nd IEEE Conference on Decision and
Control 202
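The abstract above does not specify its bandit-based technique, so the sketch below substitutes the standard UCB1 rule purely to show what bandit-style selection among candidate partial strategies can look like. The function name, the arguments `counts` and `values`, and the exploration constant `c` are all assumptions, not part of the paper.

```python
import math


def ucb1_select(counts: list[int], values: list[float],
                c: float = math.sqrt(2.0)) -> int:
    """Return the index of the option to try next under UCB1.

    counts[i] is how often option i has been tried and values[i] is its
    mean observed payoff. Untried options are selected first; otherwise
    the option with the highest mean plus exploration bonus is chosen.
    """
    for i, n in enumerate(counts):
        if n == 0:
            return i
    total = sum(counts)
    scores = [
        values[i] + c * math.sqrt(math.log(total) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

With equal visit counts the rule picks the option with the higher mean payoff. A rarely tried option can still win through its exploration bonus.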
Sparse tree search optimality guarantees in POMDPs with continuous observation spaces
Partially observable Markov decision processes (POMDPs) with continuous state
and observation spaces have powerful flexibility for representing real-world
decision and control problems but are notoriously difficult to solve. Recent
online sampling-based algorithms that use observation likelihood weighting have
shown unprecedented effectiveness in domains with continuous observation
spaces. However, there has been no formal theoretical justification for this
technique. This work offers such a justification, proving that a simplified
algorithm, partially observable weighted sparse sampling (POWSS), will estimate
Q-values accurately with high probability and can be made to perform
arbitrarily near the optimal solution by increasing computational power
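One simple way to picture how observation likelihood weighting enters a Q-value estimate is a self-normalized weighted average over sampled outcomes. This is a sketch under stated assumptions, not the POWSS algorithm itself: the function name, its arguments, and the discount `gamma` are hypothetical.

```python
def weighted_q_estimate(weights: list[float], rewards: list[float],
                        next_values: list[float], gamma: float = 0.95) -> float:
    """Self-normalized weighted estimate of a one-step Q-value.

    Each sampled outcome i contributes reward r_i plus the discounted
    value estimate V_i of its successor, weighted by w_i (for example,
    an observation likelihood):
        Q ~= sum_i w_i * (r_i + gamma * V_i) / sum_i w_i
    """
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    weighted_sum = sum(
        w * (r + gamma * v)
        for w, r, v in zip(weights, rewards, next_values)
    )
    return weighted_sum / total
```

Outcomes that explain the observation poorly get small weights and so contribute little to the estimate, without being discarded.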